Test-time (inference-time) scaling

See also Test-time training.

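Muennighoff2025s1 showed that you can improve the performance of LLMs on reasoning benchmarks by prolonging the “thinking” process: when the model tries to end its reasoning, the end-of-thinking delimiter is suppressed and a “Wait” token is appended instead, which pushes the model to keep reasoning and often to re-check its answer. They call this “budget forcing”.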
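A minimal sketch of the “wait”-injection idea, not the paper's actual implementation. The names `budget_force`, `toy_generate`, and the `</think>` delimiter are hypothetical, chosen only for illustration, and `generate` stands in for a real model call that returns text up to (but not including) the point where the model would stop thinking.

```python
from typing import Callable

END_THINK = "</think>"  # assumed end-of-thinking delimiter (hypothetical)
WAIT = "Wait"           # token appended to prolong the reasoning


def budget_force(generate: Callable[[str], str], prompt: str, num_waits: int = 2) -> str:
    """Extend a reasoning trace by appending WAIT each time the model would stop.

    `generate(text)` is assumed to return the model's continuation of `text`,
    cut off just before it would emit END_THINK.
    """
    trace = prompt
    for i in range(num_waits + 1):
        trace += generate(trace)
        if i < num_waits:
            # Suppress the end of thinking and nudge the model to continue.
            trace += " " + WAIT + ","
    # After the last round, let the thinking phase end.
    return trace + END_THINK


def toy_generate(text: str) -> str:
    """Stand-in for a model: emits one numbered 'step' per call."""
    return f" step {text.count(WAIT)}."


result = budget_force(toy_generate, "Q:", num_waits=2)
print(result)
```

With a real model, `generate` would wrap a decoding call that stops at the end-of-thinking delimiter; `num_waits` then acts as a knob for how much extra test-time compute is spent.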

The State of LLM Reasoning Model Inference by Sebastian Raschka